SIGNIFICANCE AND EXPLANATION


Penalty function minimization is a useful technique for converting
constrained optimization problems to simpler unconstrained optimization
problems. One difficulty with this approach has been the determination
of the size of an adequate penalty parameter. In this work we show how
to choose precisely the penalty parameter in order to meet any preassigned
accuracy. In addition we use penalty functions to obtain bounds on the
size of a solution of a constrained optimization problem without solving
it. We also show how our results can be used to solve huge sparse linear
programs to any desired degree of accuracy.


The  responsibility  for  the  wording  and  views  expressed  in  this  descriptive 
summary  lies  with  MRC,  and  not  with  the  author  of  this  report. 


SOME  APPLICATIONS  OF  PENALTY  FUNCTIONS 
IN  MATHEMATICAL  PROGRAMMING 

O. L. Mangasarian

1. Introduction

We consider in this work the constrained minimization problem

(1.1)  $\min_{x \in X} f(x), \quad X := X_0 \cap X_1$

where $X_0$ and $X_1$ are subsets of the n-dimensional real space $R^n$ which
have a nonempty intersection $X$, and $f: X_0 \to R$. Associated with the
above problem is the classical exterior penalty problem [3,2,1]

(1.2)  $\min_{x \in X_0} P(x,\alpha) := f(x) + \alpha Q(x)$

where $\alpha$ is in $R_+$, the nonnegative real line, and $Q: X_0 \to R_+$ is such
that $Q(x) = 0$ for $x \in X$, else $Q(x) > 0$. We have two principal
applications in mind regarding the penalty problem (1.2). The first application,
which employs in addition to (1.2) the recent boundedness and existence
results for monotone complementarity problems [10] and which is described
in  Section  3  of  the  paper,  gives  existence  and  boundedness  results  for  a 
convex  program  obtained  from  (1.1)  and  the  associated  dual  problem.  In 
particular  we  show  in  Theorem  3.1  that  if  there  exists  a  point  which  is 
feasible  for  a  primal  convex  program  and  is  interior  to  the  constraints  of 
its  Wolfe  dual  [12,5],  then  the  primal  problem  has  a  solution  which  is 
easily  bounded  in  terms  of  the  feasible  point,  and  that  there  is  no  duality 
gap  between  the  primal  problem  and  its  Wolfe  dual.  Theorem  3.2  shows  that 
if there is a point which is interior to the constraints of a primal convex
program  which  is  also  feasible  for  the  associated  Wolfe  dual,  then  the 
Lagrangian  dual  [4,1]  of  the  convex  program  has  a  nonempty  solution  set 
which  is  easily  bounded  in  terms  of  the  feasible  point,  and  in  addition 
there  is  no  duality  gap  between  the  primal  problem  and  its  Lagrangian  dual. 

In  Section  4  our  main  concern  is  the  recasting  by  means  of  an  exterior 
penalty  function  of  the  standard  linear  programming  problem  as  a  quadratic 
minimization  problem  on  the  nonnegative  orthant  in  the  spirit  of  previous 
work  [6,7,8].  The  principal  new  result  here  is  to  show  how  to  obtain  a 
precise  value  of  the  penalty  parameter  which  allows  us  to  satisfy  the 
Karush-Kuhn-Tucker  optimality  conditions  [5]  for  the  linear  program  to  any 
preassigned  degree  of  precision.  Theorem  4.1  shows  that  this  can  be  done 
by  minimizing  a  convex  quadratic  function  on  the  nonnegative  orthant  for 
only  two  values  of  the  penalty  parameter.  Iterative  methods  developed 
in  [6,7,8]  can  solve  by  this  approach  very  large  sparse  linear  programs 
which  cannot  be  solved  by  a  standard  linear  programming  simplex  package  [8]. 

Because  of  the  key  role  played  by  exterior  penalty  functions  in  this 
work,  we  give  in  Section  2  some  fundamental  results  regarding  these  functions 
in  a  form  convenient  for  deriving  our  other  results.  Although  some  of  these 
penalty  results  are  known  under  more  restrictive  conditions  [3,2],  some  are 
new. For example, Theorem 2.3 shows that by solving only two exterior
penalty  function  minimization  problems,  we  can  obtain  an  optimal  point  which 
is  feasible  to  any  preassigned  feasibility  tolerance.  Theorem  2.8  shows 
that  under  rather  mild  assumptions  each  accumulation  point  of  a  sequence  of 
solutions  of  penalty  functions,  corresponding  to  an  increasing  unbounded 
sequence  of  positive  numbers,  solves  the  associated  constrained  optimization 


problem.  Furthermore  the  corresponding  sequence  of  products  of  the 
penalty  parameter  and  the  penalty  term  tends  to  zero. 

We briefly describe our notation now. Vectors will be column or row
vectors depending on the context. For a vector x in the n-dimensional real
space $R^n$, $\|x\|$ will denote an arbitrary norm, while $\|x\|_p$ will denote the
p-norm $\|x\|_p := (\sum_{i=1}^{n} |x_i|^p)^{1/p}$ for $1 \le p < \infty$ and
$\|x\|_\infty := \max_{1 \le i \le n} |x_i|$, where $x_i$ is the i-th component of x;
$x_+$ will denote the vector in $R^n$ with components
$(x_+)_i = \max\{x_i, 0\}$, $i = 1,\ldots,n$. A vector of ones in any real space
will be denoted by e. For a differentiable function $L: R^n \times R^m \to R$,
$\nabla_x L(x,u)$ will denote the n-dimensional gradient vector with components
$\partial L(x,u)/\partial x_i$, $i = 1,\ldots,n$, while for $f: R^n \to R$, $\nabla f(x)$
will denote the n-dimensional gradient vector. The set of vectors in $R^n$
with nonnegative components will be denoted by $R^n_+$.
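The norms and the plus operator just defined translate directly into code. The following is a minimal Python sketch of these definitions; the function names `p_norm`, `inf_norm` and `plus` are illustrative choices and do not appear in the paper.

```python
def p_norm(x, p):
    """p-norm (sum_i |x_i|^p)^(1/p) for 1 <= p < infinity."""
    if p < 1:
        raise ValueError("p must satisfy 1 <= p < infinity")
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


def inf_norm(x):
    """Infinity norm: the largest absolute component of x."""
    return max(abs(xi) for xi in x)


def plus(x):
    """The vector x_+ with components max{x_i, 0}."""
    return [max(xi, 0.0) for xi in x]
```

For example, `p_norm([3.0, -4.0], 2)` evaluates the Euclidean length of the vector (3, -4), and `plus` zeroes out the negative components, which is how the penalty terms $e\,g(x)_+$ used later are formed.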


We collect in this section some fundamental properties of exterior
penalty  functions  in  a  form  convenient  for  our  applications  and  under  more 
general  assumptions  than  usually  given  [3,1].  We  begin  with  some  elementary 
but  important  monotonicity  properties  for  solutions  of  penalty  problems. 

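2.1 Proposition Let $x_i \in X_0$ be a solution of $\min_{x \in X_0} P(x,\alpha_i)$ for
$i = 1,2$ with $\alpha_2 > \alpha_1 \ge 0$. Then

(2.1)  $Q(x_2) \le Q(x_1), \quad f(x_1) \le f(x_2), \quad P(x_1,\alpha_1) \le P(x_2,\alpha_2)$

Proof Addition of $P(x_2,\alpha_2) \le P(x_1,\alpha_2)$ and $P(x_1,\alpha_1) \le P(x_2,\alpha_1)$ gives,
together with $\alpha_2 > \alpha_1$, the inequality $Q(x_2) \le Q(x_1)$, which in turn
together with $P(x_1,\alpha_1) \le P(x_2,\alpha_1)$ and $\alpha_1 \ge 0$, gives $f(x_1) \le f(x_2)$.
We also have that

$P(x_1,\alpha_1) \le P(x_2,\alpha_1) \le P(x_2,\alpha_2)$  □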
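The monotonicity properties (2.1) can be checked numerically on a small example. The sketch below assumes a toy problem that is not taken from the paper: $X_0 = R$, $X = \{x \mid x \ge 1\}$, $f(x) = x$ and $Q(x) = \max\{0, 1-x\}^2$, for which the minimizer of $P(x,\alpha)$ is $1 - 1/(2\alpha)$ in closed form. All function names are hypothetical.

```python
def f(x):
    """Toy objective f(x) = x (an assumption for illustration)."""
    return x


def Q(x):
    """Toy exterior penalty for the constraint x >= 1."""
    return max(0.0, 1.0 - x) ** 2


def P(x, alpha):
    """Penalty function P(x, alpha) = f(x) + alpha * Q(x) of (1.2)."""
    return f(x) + alpha * Q(x)


def argmin_P(alpha):
    """Closed-form minimizer of P(., alpha) over R for this toy problem,
    obtained from 1 - 2*alpha*(1 - x) = 0; valid for alpha > 0."""
    return 1.0 - 1.0 / (2.0 * alpha)


def check_monotonicity(alpha1, alpha2):
    """Return the truth values of the three inequalities in (2.1)."""
    x1, x2 = argmin_P(alpha1), argmin_P(alpha2)
    return (Q(x2) <= Q(x1), f(x1) <= f(x2), P(x1, alpha1) <= P(x2, alpha2))
```

Calling `check_monotonicity(1.0, 4.0)` compares the penalty minimizers 0.5 and 0.875; the infeasibility drops while both the objective and the penalty function value rise, as (2.1) asserts.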

2.2 Proposition Let $\inf_{x \in X} f(x) > -\infty$, let $\alpha \ge 0$ and let $x(\alpha) \in X_0$ be such
that $P(x(\alpha),\alpha) = \min_{x \in X_0} P(x,\alpha)$. Then

(2.2)  $f(x(\alpha)) \le \inf_{x \in X} f(x)$

If $x(\alpha) \in X$ then

(2.3)  $f(x(\alpha)) = \min_{x \in X} f(x)$

Proof For any $\epsilon > 0$ pick $x(\epsilon) \in X$ such that

$f(x(\epsilon)) < \inf_{x \in X} f(x) + \epsilon$

Then

$\epsilon + \inf_{x \in X} f(x) > f(x(\epsilon)) = P(x(\epsilon),\alpha) \ge P(x(\alpha),\alpha) \ge f(x(\alpha))$

Since $x(\alpha)$ does not depend on $\epsilon$, (2.2) follows by letting $\epsilon$ approach
zero. If $x(\alpha)$ is also in $X$, then (2.3) is obviously a consequence of
(2.2). □


The following simple theorem shows how, for any desired feasibility
tolerance $\delta > 0$, solving the penalty problem (1.2) for only two values of
the penalty parameter $\alpha$ will yield a point $x_2 \in X_0$ such that $Q(x_2) \le \delta$
and $f(x_2) \le \inf_{x \in X} f(x)$. Hence if $\delta$ is chosen sufficiently small, $x_2$ is an
approximately feasible optimal solution for the minimization problem (1.1).


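2.3 Theorem Let $\delta > 0$, $\alpha_1 > 0$, let $\inf_{x \in X} f(x) > -\infty$, let $\hat{x} \in X$ and let
$P(x_1,\alpha_1) = \min_{x \in X_0} P(x,\alpha_1)$. If $f(\hat{x}) \le f(x_1)$ then $\hat{x}$ solves
$\min_{x \in X} f(x)$, else for

(2.4)  $\alpha_2 > \alpha_1$ and $\alpha_2 \ge \dfrac{f(\hat{x}) - f(x_1)}{\delta}$

it follows that

(2.5)  $x_2 \in X_0, \quad Q(x_2) \le \delta, \quad f(x_2) \le \inf_{x \in X} f(x)$

where

$P(x_2,\alpha_2) = \min_{x \in X_0} P(x,\alpha_2), \quad x_2 \in X_0$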


Proof First note that if $f(\hat{x}) \le f(x_1)$ then by (2.2) $\hat{x}$ solves $\min_{x \in X} f(x)$.
Suppose now that $f(\hat{x}) > f(x_1)$ and (2.4) holds. Then

(2.6)  $f(x_2) + \alpha_2 Q(x_2) \le f(\hat{x}) + \alpha_2 Q(\hat{x}) = f(\hat{x})$

Hence by (2.4), (2.1) and (2.6) respectively it follows that

$\delta \ge \dfrac{f(\hat{x}) - f(x_1)}{\alpha_2} \ge \dfrac{f(\hat{x}) - f(x_2)}{\alpha_2} \ge Q(x_2)$

which establishes the first inequality of (2.5). The second inequality of
(2.5) follows from (2.2). □


2.4  Remark  Theorem  2.3  can  be  applied  to  obtain  an  approximate  solution  of 
(1.1)  in  the  sense  of  (2.5)  as  follows: 


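(a) Choose $\delta > 0$, $\alpha_1 > 0$, $\hat{x} \in X$.

(b) Compute $x_1$ such that: $P(x_1,\alpha_1) = \min_{x \in X_0} P(x,\alpha_1)$. If $f(\hat{x}) \le f(x_1)$,
stop, $\hat{x}$ solves (1.1).

(c) Choose $\alpha_2$ such that $\alpha_2 > \alpha_1$ and $\alpha_2 \ge \dfrac{f(\hat{x}) - f(x_1)}{\delta}$.

(d) Compute $x_2 \in X_0$ such that: $P(x_2,\alpha_2) = \min_{x \in X_0} P(x,\alpha_2)$.

If $\alpha_2$ of step (c) is too large, an $\tilde{\alpha}_1$ such that $\alpha_1 < \tilde{\alpha}_1 < \alpha_2$ can be chosen
to replace $\alpha_1$ and steps (a)-(b)-(c) are repeated. Also $\hat{x}$ may be replaced
when possible by some $\tilde{x} \in [\hat{x},x_1] \cap X$ such that $f(\tilde{x}) < f(\hat{x})$.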
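Steps (a)-(d) of this remark can be sketched as a short routine. This is a minimal illustration and not the author's implementation: the caller supplies a hypothetical `argmin_P(alpha)` that returns a minimizer of $f + \alpha Q$ over $X_0$, and the rule `alpha2 = max(2 * alpha1, ...)` is just one arbitrary way of meeting both requirements of (2.4).

```python
def two_stage_penalty(f, Q, argmin_P, x_hat, delta, alpha1):
    """Two penalty solves in the sense of Theorem 2.3.

    f, Q      : objective and exterior penalty term
    argmin_P  : callable alpha -> minimizer of f + alpha*Q over X_0 (assumed given)
    x_hat     : any feasible point of X
    delta     : feasibility tolerance, delta > 0
    alpha1    : first penalty parameter, alpha1 > 0
    Returns (point, penalty parameter used).
    """
    x1 = argmin_P(alpha1)                      # step (b)
    if f(x_hat) <= f(x1):
        return x_hat, alpha1                   # x_hat already solves (1.1)
    # step (c): alpha2 > alpha1 and alpha2 >= (f(x_hat) - f(x1)) / delta
    alpha2 = max(2.0 * alpha1, (f(x_hat) - f(x1)) / delta)
    x2 = argmin_P(alpha2)                      # step (d)
    return x2, alpha2


# Toy data (an assumption, not from the paper): minimize x subject to x >= 1.
def toy_f(x):
    return x


def toy_Q(x):
    return max(0.0, 1.0 - x) ** 2


def toy_argmin_P(alpha):
    # closed-form minimizer of x + alpha * max(0, 1 - x)^2 over R, alpha > 0
    return 1.0 - 1.0 / (2.0 * alpha)
```

With `x_hat = 2.0`, `delta = 1e-3` and `alpha1 = 1.0`, the routine picks a second parameter of about 1500 and returns a point slightly below 1 whose penalty term is far smaller than the tolerance, while its objective value stays below the constrained minimum 1, as (2.5) states.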

The next result shows that for a sequence of solutions $\{x_i\}$ of the
penalty problem (1.2) for an increasing unbounded sequence of penalty
parameters $\{\alpha_i\}$, the sequence of penalties $\{Q(x_i)\}$ converges to 0 and the
sequence $\{f(x_i)\}$ converges to a lower bound for $\inf_{x \in X} f(x)$, provided the
latter is finite. We do not require that the sequence $\{x_i\}$ have an
accumulation point here.


2.5 Theorem Let $\inf_{x \in X} f(x) > -\infty$, let $\{\alpha_i\}$ be an increasing unbounded
sequence of positive numbers, let $\{x_i\}$ be a corresponding sequence of
points in $X_0$ not in $X$ such that $P(x_i,\alpha_i) = \min_{x \in X_0} P(x,\alpha_i)$. Then

(2.7)  $\lim_{i \to \infty} Q(x_i) = 0$ and $\lim_{i \to \infty} f(x_i) \le \inf_{x \in X} f(x)$.

Proof By (2.1), the sequence $\{Q(x_i)\}$ is nonincreasing and bounded below by
0 and hence converges to $\bar{Q} \ge 0$ and $Q(x_i) \ge \bar{Q}$, $i = 1,2,\ldots$ . If $\bar{Q} > 0$ we get
from (2.5) by picking $i$ sufficiently large such that $\alpha_i > 2(f(\hat{x}) - f(x_1))/\bar{Q}$
where $\hat{x} \in X$, that $Q(x_i) \le \bar{Q}/2 < \bar{Q}$, which is a contradiction. Hence $\bar{Q} = 0$
and $\lim_{i \to \infty} Q(x_i) = 0$. Now again by (2.1), the sequence $\{f(x_i)\}$ is
nondecreasing, and by (2.2) it is bounded above by $\inf_{x \in X} f(x)$. Hence $\{f(x_i)\}$
converges to $\bar{f}$ and

$f(x_i) \le \bar{f} \le \inf_{x \in X} f(x)$  □

To make the inequality in (2.7) an equality we need additional
assumptions such as those given in the following corollary.

2.6 Corollary If in addition to the assumptions of Theorem 2.5, f is
Lipschitz continuous on $X_0$, that is for some $K > 0$

(2.8)  $|f(y) - f(x)| \le K\|y - x\|_2$ for all $x, y \in X_0$

and there exists a constant $\gamma > 0$ such that for each $x \in X_0$ there exists
an $\bar{x}(x) \in X$ such that

(2.9)  $\|x - \bar{x}(x)\|_2 \le \gamma Q(x)$

then

(2.10)  $\lim_{i \to \infty} f(x_i) = \inf_{x \in X} f(x)$

Proof For each $x_i$ there exists an $\bar{x}_i \in X$ such that

$\|x_i - \bar{x}_i\|_2 \le \gamma Q(x_i)$

Hence by (2.8) and (2.9)

(2.11)  $0 \le |f(\bar{x}_i) - f(x_i)| \le K\|\bar{x}_i - x_i\|_2 \le K\gamma Q(x_i)$

Since by (2.7) $\lim_{i \to \infty} Q(x_i) = 0$, it follows from (2.11) that

(2.12)  $\lim_{i \to \infty} f(\bar{x}_i) = \lim_{i \to \infty} f(x_i)$

From (2.11) and $\bar{x}_i \in X$ we get the inequalities

$f(x_i) + K\gamma Q(x_i) \ge f(\bar{x}_i) \ge \inf_{x \in X} f(x)$

Taking the limit as $i \to \infty$ and using (2.7) gives

$\inf_{x \in X} f(x) \ge \lim_{i \to \infty} f(x_i) \ge \inf_{x \in X} f(x)$

Hence $\lim_{i \to \infty} f(x_i) = \inf_{x \in X} f(x)$. □

2.7 Remark Condition (2.9) is satisfied if the feasible region X is convex
and satisfies an appropriate constraint qualification [9, Theorem 2.1]. In
particular (2.9) holds in the special case when $X_0 = R^n$ and $X_1$ is defined
by linear inequalities [9, Theorem 2.1].

We observe that in both Theorem 2.5 and Corollary 2.6 the sequence $\{x_i\}$
need not have an accumulation point. A stronger result is obtained if $\{x_i\}$
has an accumulation point.

2.8 Theorem Let $\inf_{x \in X} f(x) > -\infty$ and let $\{\alpha_i\}$ be an increasing unbounded
sequence of positive numbers. Let $\{x_i\}$ be a corresponding sequence of points
in $X_0$ not in $X$ such that $P(x_i,\alpha_i) = \min_{x \in X_0} P(x,\alpha_i)$ with an accumulation point
$\bar{x} \in X_0$. If f and Q are lower semicontinuous at $\bar{x}$, then $Q(\bar{x}) = 0$ and
$\bar{x}$ solves $\min_{x \in X} f(x)$. Furthermore

(2.13)  $\lim_{j \to \infty} \alpha_{i_j} Q(x_{i_j}) = 0$ for $x_{i_j} \to \bar{x} \in X_0$.

Proof Let $x_{i_j} \to \bar{x} \in X_0$. From (2.7) and the lsc of Q we have

$0 = \lim_{j \to \infty} Q(x_{i_j}) \ge Q(\bar{x}) \ge 0$

Hence $Q(\bar{x}) = 0$ and $\bar{x} \in X$. From (2.7) and the lsc of f we have

$f(\bar{x}) \le \lim_{j \to \infty} f(x_{i_j}) \le \inf_{x \in X} f(x)$

Since $\bar{x} \in X$, it follows that $\bar{x}$ solves $\min_{x \in X} f(x)$. To establish (2.13)
note that

$0 \ge P(x_{i_j},\alpha_{i_j}) - P(\bar{x},\alpha_{i_j}) = f(x_{i_j}) - f(\bar{x}) + \alpha_{i_j} Q(x_{i_j})$

Hence

$f(\bar{x}) - f(x_{i_j}) \ge \alpha_{i_j} Q(x_{i_j}) \ge 0$

By letting $j \to \infty$ and recalling that f is lsc at $\bar{x}$ it follows that

$\lim_{j \to \infty} \alpha_{i_j} Q(x_{i_j}) = 0$  □
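The limit (2.13) says that the product of the penalty parameter and the penalty term vanishes, not just the penalty term itself. The sketch below illustrates this on an assumed toy problem that is not from the paper (minimize x subject to x >= 1, with $Q(x) = \max\{0, 1-x\}^2$), where the penalty minimizer is $1 - 1/(2\alpha)$ and hence $\alpha Q = 1/(4\alpha)$. The function names are hypothetical.

```python
def Q(x):
    """Toy exterior penalty for the constraint x >= 1 (an assumption)."""
    return max(0.0, 1.0 - x) ** 2


def argmin_P(alpha):
    """Closed-form minimizer of x + alpha * Q(x) over R, valid for alpha > 0."""
    return 1.0 - 1.0 / (2.0 * alpha)


def penalty_products(alphas):
    """Return alpha_i * Q(x_i) for the penalty minimizers x_i = argmin_P(alpha_i)."""
    return [a * Q(argmin_P(a)) for a in alphas]
```

Evaluating `penalty_products([1.0, 10.0, 100.0, 1000.0])` gives values that shrink by roughly a factor of ten at each step, while the minimizers approach the accumulation point 1, which is the solution of the constrained toy problem.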


3.  Bounds  and  Existence  for  Dual  Convex  Programs 

We consider in this section the convex primal program

(3.1)  $\min_{x \in X} f(x), \quad X = \{x \mid x \in R^n_+, \; g(x) \le 0\}$

where $f: R^n \to R$, $g: R^n \to R^m$ are differentiable and convex on $R^n$. The
Wolfe dual [12,5] associated with this problem is

(3.2)  $\max_{(x,u,v) \in Y} L(x,u) - vx, \quad Y = \{(x,u,v) \mid x \in R^n, \; u \in R^m_+, \; v \in R^n_+, \; \nabla_x L(x,u) - v = 0\}$

and the Lagrangian dual [4,1] is

(3.3)  $\max_{(u,v) \ge 0} \; \inf_{x \in R^n} L(x,u) - vx$

where $L(x,u) := f(x) + ug(x)$ is the usual Lagrangian. Note that (3.2) is
equivalent to

(3.2')  $\max_{(x,u) \in Z} L(x,u) - x\nabla_x L(x,u), \quad Z = \{(x,u) \mid x \in R^n, \; u \in R^m_+, \; \nabla_x L(x,u) \ge 0\}$

Note that (3.1) can be identified with problem (1.1) by setting $X_0 = R^n_+$
and $X_1 = \{x \mid g(x) \le 0\}$.


Our  primary  objective  here  is  to  give  simple  conditions  for  the  separate 
existence  of  a  solution  to  each  of  primal  and  Lagrangian  dual  problems  and  to 
bound  their  solutions.  Loosely  speaking  we  shall  establish  existence  of  a 
solution  and  a  bound  for  the  primal  (Lagrangian  dual)  problem  under  a  primal 
and  Wolfe-dual  feasibility  assumption  together  with  a  Wolfe-dual  (primal) 
constraint  interiority  assumption.  Our  principal  tools  will  be  the  recent 


boundedness  and  existence  results  for  monotone  complementarity  problems  and 
convex programs of [10] and the penalty function results outlined in the
previous  section.  We  begin  with  an  existence  and  boundedness  result  for  the 
primal  problem  (3.1). 


3.1 Theorem (Primal feasibility & Wolfe dual interior-feasibility $\Rightarrow$ Primal
solution existence-boundedness & zero duality gap with Wolfe dual) Let f
and g be differentiable and convex on $R^n$ and let $(\hat{x},\hat{u})$ satisfy

$\hat{x} \in X, \quad (\hat{x},\hat{u}) \in Z, \quad \nabla_x L(\hat{x},\hat{u}) > 0$

Then there exists a primal optimal solution $\bar{x}$ to (3.1) which is bounded by

(3.4)  $\|\bar{x}\|_1 \le \dfrac{-\hat{u}g(\hat{x}) + \hat{x}\nabla_x L(\hat{x},\hat{u})}{\min_i \, (\nabla_x L(\hat{x},\hat{u}))_i}$

In addition there exists no duality gap between the primal problem (3.1) and
the Wolfe dual (3.2), that is:

(3.5)  $\min_{x \in X} f(x) = f(\bar{x}) = \sup_{(x,u,v) \in Y} L(x,u) - vx$
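The bound (3.4) needs only one feasible point and its Lagrangian gradient, so it is cheap to evaluate. The sketch below computes its right-hand side for an assumed one-dimensional example that is not from the paper: $f(x) = x$, $g(x) = 1 - x$, so that $X = \{x \ge 0, \; 1 - x \le 0\}$ and $\nabla_x L(x,u) = 1 - u$. The helper names `dot` and `primal_bound` are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def primal_bound(x_hat, u_hat, g_hat, grad_L_hat):
    """Right-hand side of (3.4).

    x_hat, u_hat : primal feasible point and multiplier with (x_hat, u_hat) in Z
    g_hat        : constraint values g(x_hat)
    grad_L_hat   : gradient of the Lagrangian in x at (x_hat, u_hat), componentwise > 0
    """
    smallest = min(grad_L_hat)
    if smallest <= 0.0:
        raise ValueError("interiority assumption violated: gradient must be positive")
    return (-dot(u_hat, g_hat) + dot(x_hat, grad_L_hat)) / smallest


# Toy data: x_hat = 2 is feasible, u_hat = 0.5 gives gradient 1 - 0.5 = 0.5 > 0.
toy_bound = primal_bound([2.0], [0.5], [1.0 - 2.0], [1.0 - 0.5])
```

For this toy problem the constrained minimizer is $\bar{x} = 1$, and the computed bound is 3, so $\|\bar{x}\|_1 \le 3$ holds with room to spare. The bound depends on the choice of $(\hat{x},\hat{u})$ and is generally not tight.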

Proof Consider the penalty function problem associated with (3.1)

(3.6)  $\min_{x \ge 0} f(x) + \alpha e g(x)_+$

or equivalently

(3.6')  $\min_{(x,z) \ge 0} f(x) + \alpha ez$  s.t.  $g(x) - z \le 0$

The Wolfe dual associated with (3.6') is

(3.7)  $\max_{(x,u,v,w)} L(x,u) + z(\alpha e - u - w) - vx$
       s.t.  $\nabla_x L(x,u) - v = 0, \quad \alpha e - u - w = 0, \quad u, v, w \ge 0$

which is equivalent to

(3.7')  $\max_{(x,u)} L(x,u) - x\nabla_x L(x,u)$  s.t.  $\nabla_x L(x,u) \ge 0, \quad \alpha e \ge u \ge 0$

Note that the only difference between (3.7') and (3.2') is the constraint
$\alpha e \ge u$. Now, for any $\epsilon > 0$, the point $(\hat{x}, z := \epsilon e, \hat{u})$ satisfies a
"Slater" constraint qualification for the dual problems (3.6')-(3.7') for
$\alpha > \|\hat{u}\|_\infty$. Hence these problems have equal extrema and a solution
$(x(\alpha), z(\alpha), u(\alpha))$ such that $x(\alpha)$ is bounded by [10, Theorem 2.3]

(3.8)  $\|x(\alpha)\|_1 \le \dfrac{\hat{u}(-g(\hat{x}) + \epsilon e) + \hat{x}\nabla_x L(\hat{x},\hat{u}) + \epsilon e(\alpha e - \hat{u})}{\min_i \, (\nabla_x L(\hat{x},\hat{u}))_i}$

Since the left side of (3.8) does not depend on $\epsilon$, we can let $\epsilon \to 0$ in
(3.8) and we have

(3.9)  $\|x(\alpha)\|_1 \le \dfrac{-\hat{u}g(\hat{x}) + \hat{x}\nabla_x L(\hat{x},\hat{u})}{\min_i \, (\nabla_x L(\hat{x},\hat{u}))_i}$

Note now that by the weak duality theorem [5] applied to (3.1) and (3.3) we
have

$\inf_{x \in X} f(x) \ge L(\hat{x},\hat{u}) - \hat{x}\nabla_x L(\hat{x},\hat{u}) > -\infty$

Hence for an unbounded increasing sequence of positive numbers $\{\alpha_i\}$
exceeding $\|\hat{u}\|_\infty$, it follows [10, Theorem 2.3] that there exists a sequence of
points $\{x(\alpha_i), u(\alpha_i)\}$ with $x(\alpha_i)$ bounded as in (3.9), such that each $x(\alpha_i)$
solves the penalty function problem (3.6) with $\alpha = \alpha_i$ and $(x(\alpha_i), u(\alpha_i))$
solves its dual (3.7'). Since $\{x(\alpha_i)\}$ is bounded it has an accumulation
point $\bar{x}$ which is bounded by (3.9). Since $ez(\alpha_i) = e(g(x(\alpha_i)))_+$ is the penalty
term for (3.6'), it follows by Theorem 2.8 that $e\bar{z} = eg(\bar{x})_+ = 0$, that $\bar{x}$
solves $\min_{x \in X} f(x)$ and that

(3.10)  $\lim_{j \to \infty} \alpha_{i_j} ez(\alpha_{i_j}) = \lim_{j \to \infty} \alpha_{i_j} e(g(x(\alpha_{i_j})))_+ = 0$ for $x(\alpha_{i_j}) \to \bar{x}$

Now we establish the zero duality gap. Let $\{\epsilon_i\}$ be any decreasing sequence
of positive numbers converging to 0 and let $\{\alpha_i\}$ be an unbounded increasing
sequence of positive numbers chosen as follows:

$\infty > \sup_{(x,u) \in Z} L(x,u) - x\nabla_x L(x,u) - \epsilon_i$
    (By weak duality theorem)

$< L(x(\epsilon_i), u(\epsilon_i)) - x(\epsilon_i)\nabla_x L(x(\epsilon_i), u(\epsilon_i))$
    (For some $(x(\epsilon_i), u(\epsilon_i)) \in Z$, by definition of sup)

$\le L(x(\alpha_i), u(\alpha_i)) - x(\alpha_i)\nabla_x L(x(\alpha_i), u(\alpha_i))$
    (For sufficiently large $\alpha_i$ s.t. $\alpha_i \ge \|u(\epsilon_i)\|_\infty$,
    because $(x(\alpha_i), u(\alpha_i))$ solves $\max L(x,u) - x\nabla_x L(x,u)$
    s.t. $\nabla_x L(x,u) \ge 0$, $\alpha_i e \ge u \ge 0$)

$= f(x(\alpha_i)) + \alpha_i ez(\alpha_i)$
    (By equality of primal-dual optimal objective functions
    of problems (3.6') and (3.7') with $\alpha = \alpha_i$)

$= \sup_{(x,u)} \{L(x,u) - x\nabla_x L(x,u) \mid \nabla_x L(x,u) \ge 0, \; \alpha_i e \ge u \ge 0\}$

$\le \sup_{(x,u) \in Z} L(x,u) - x\nabla_x L(x,u)$


Since by (3.10), $\lim_{j \to \infty} \alpha_{i_j} ez(\alpha_{i_j}) = 0$ for $x(\alpha_{i_j}) \to \bar{x}$, it follows that

$\sup_{(x,u) \in Z} L(x,u) - x\nabla_x L(x,u) = f(\bar{x}) = \min_{x \in X} f(x)$  □

We  establish  now  an  existence  and  boundedness  result  for  the  Lagrangian 
dual  problem  (3.3). 

3.2 Theorem (Wolfe-dual feasibility & primal interior-feasibility $\Rightarrow$
Lagrangian dual solution existence-boundedness & zero duality gap with primal)
Let f and g be differentiable and convex on $R^n$ and let $(\hat{x},\hat{u})$ satisfy:

(3.11)  $\hat{x} \in X, \quad (\hat{x},\hat{u}) \in Z, \quad \hat{x} > 0, \quad g(\hat{x}) < 0$

There exists a dual optimal solution $(\bar{u},\bar{v})$ to the Lagrangian dual (3.3)
which is bounded by

(3.12)  $\|\bar{u},\bar{v}\|_1 \le \dfrac{-\hat{u}g(\hat{x}) + \hat{x}\nabla_x L(\hat{x},\hat{u})}{\min_{i,j} \, \{-g_i(\hat{x}), \hat{x}_j\}}$

In addition there is no duality gap between the primal problem (3.1) and the
Lagrangian dual (3.3), that is:

(3.13)  $\inf_{x \in X} f(x) = \max_{(u,v) \ge 0} \; \inf_{x \in R^n} L(x,u) - xv$

Proof For $\beta > 0$ consider the bounded version of (3.1)

(3.14)  $\min f(x)$  s.t.  $g(x) \le 0, \quad \beta e \ge x \ge 0$

and its Wolfe dual

(3.15)  $\max_{(x,u,v,w)} L(x,u) - vx + w(x - \beta e)$
        s.t.  $\nabla_x L(x,u) - v + w = 0, \quad u, v, w \ge 0$

or equivalently

(3.15')  $\max_{(x,u,w)} L(x,u) - x\nabla_x L(x,u) - \beta ew$
         s.t.  $\nabla_x L(x,u) + w \ge 0, \quad u, w \ge 0$

which again is equivalent to

(3.15'')  $\max_{(x,u), \, u \ge 0} L(x,u) - x\nabla_x L(x,u) - \beta e(-\nabla_x L(x,u))_+$

which is nothing other than an exterior penalty function formulation for the
Wolfe dual (3.2') with penalty parameter $\beta$. Thus the bound $\beta$ on the $\infty$-norm
of the primal variable x becomes a penalty parameter on the Wolfe dual.

Now for any $\epsilon > 0$, the point

$(\hat{x}, \hat{u}, w := \epsilon e)$

satisfies a Slater constraint qualification for the dual problems (3.14)-(3.15')
for $\beta > \|\hat{x}\|_\infty$. Hence [10, Theorem 2.3] there exists $(x(\beta), u(\beta), v(\beta), w(\beta))$
which solves the dual problems (3.14)-(3.15) with equal extrema. For any such
solution, $(u(\beta),v(\beta))$ is bounded by [10, Theorem 2.2]

(3.16)  $\|u(\beta),v(\beta)\|_1 \le \dfrac{-\hat{u}g(\hat{x}) + \beta\epsilon ee + \hat{x}\nabla_x L(\hat{x},\hat{u})}{\min_{i,j} \, \{-g_i(\hat{x}), \hat{x}_j\}}$

Since the left side of (3.16) does not depend on $\epsilon$, we can let $\epsilon \to 0$ in
(3.16) and we have

(3.17)  $\|u(\beta),v(\beta)\|_1 \le \dfrac{-\hat{u}g(\hat{x}) + \hat{x}\nabla_x L(\hat{x},\hat{u})}{\min_{i,j} \, \{-g_i(\hat{x}), \hat{x}_j\}}$

Define now

(3.18)  $\phi(u,v) := \inf_{x \in R^n} L(x,u) - vx$

(3.19)  $\psi(u,v,w) := \inf_{x \in R^n} L(x,u) - vx + wx$

(3.20)  $\phi(u,v) = \psi(u,v,0)$

Note now that by the weak duality theorem [5]

$\infty > f(\hat{x}) \ge \sup_{(x,u) \in Z} L(x,u) - x\nabla_x L(x,u)$

Hence for an unbounded increasing sequence of positive numbers $\{\beta_i\}$
exceeding $\|\hat{x}\|_\infty$, it follows [10, Theorem 2.3] that there exists a sequence of
points $\{x(\beta_i), u(\beta_i), v(\beta_i), w(\beta_i)\}$ which solve the dual pair (3.14)-(3.15)
for $\beta = \beta_i$, giving equal extrema and such that $\{u(\beta_i), v(\beta_i)\}$ is bounded by
(3.17). Since $ew(\beta_i) = e(-\nabla_x L(x(\beta_i), u(\beta_i)))_+$ constitutes the penalty term
for (3.15''), it follows by (2.7) that $\{ew(\beta_i)\}$ converges to zero and since
$w(\beta_i) \ge 0$, it follows that $\{w(\beta_i)\}$ also converges to $\bar{w} = 0$. Let $(\bar{u},\bar{v},0)$
be an accumulation point of the bounded sequence $\{u(\beta_i), v(\beta_i), w(\beta_i)\}$. Now
we have

$c := L(\hat{x},\hat{u}) - \hat{x}\nabla_x L(\hat{x},\hat{u}) \le \inf_{x \in X} f(x)$
    (By weak duality)

$\le f(x(\beta_i))$
    (Since $x(\beta_i) \in X$)

$\le L(x(\beta_i), u(\beta_i)) - v(\beta_i)x(\beta_i) + w(\beta_i)x(\beta_i)$
    (Since $u(\beta_i)g(x(\beta_i)) = 0$, $v(\beta_i)x(\beta_i) = 0$ and $w(\beta_i)x(\beta_i) \ge 0$)

$\le L(x, u(\beta_i)) - v(\beta_i)x + w(\beta_i)x \quad \forall x \in R^n$
    (Since $\nabla_x L(x(\beta_i),u(\beta_i)) - v(\beta_i) + w(\beta_i) = 0$ and
    $L(x, u(\beta_i)) - v(\beta_i)x + w(\beta_i)x$ is convex in x)

In the limit we have

$c \le L(x,\bar{u}) - \bar{v}x + \bar{w}x \quad \forall x \in R^n$

and so

$c \le \inf_{x \in R^n} L(x,\bar{u}) - \bar{v}x + \bar{w}x = \psi(\bar{u},\bar{v},\bar{w}) = \phi(\bar{u},\bar{v})$

Since $\psi(\bar{u},\bar{v},\bar{w})$ is finite, it follows by Theorem A.1 of the Appendix, that
$\psi(u,v,w)$ is upper semicontinuous at $(\bar{u},\bar{v},\bar{w})$ with respect to $R^{m+2n}_+$. Now
let $\{\epsilon_j\} \downarrow 0$. It follows by the upper semicontinuity of $\psi(u,v,w)$ at
$(\bar{u},\bar{v},\bar{w})$ that there exists a subsequence $\{\beta_{i_j}\} \uparrow \infty$ of the unbounded
increasing sequence $\{\beta_i\}$ such that $\{u(\beta_{i_j}), v(\beta_{i_j}), w(\beta_{i_j})\}$ converges to
$(\bar{u},\bar{v},\bar{w} = 0)$ and

(3.21)  $\phi(\bar{u},\bar{v}) + \epsilon_j = \psi(\bar{u},\bar{v},\bar{w}) + \epsilon_j$

$\ge \psi(u(\beta_{i_j}), v(\beta_{i_j}), w(\beta_{i_j}))$
    (By usc of $\psi$ at $(\bar{u},\bar{v},\bar{w})$)

$= \inf_x L(x,u(\beta_{i_j})) - v(\beta_{i_j})x + w(\beta_{i_j})x$
    (By definition of $\psi$)

$= L(x(\beta_{i_j}),u(\beta_{i_j})) - v(\beta_{i_j})x(\beta_{i_j}) + w(\beta_{i_j})x(\beta_{i_j})$
    (Since $x(\beta_{i_j})$ minimizes $L(x,u(\beta_{i_j})) - v(\beta_{i_j})x + w(\beta_{i_j})x$)

$\ge f(x(\beta_{i_j}))$
    (Since $u(\beta_{i_j})g(x(\beta_{i_j})) = 0$, $v(\beta_{i_j})x(\beta_{i_j}) = 0$
    and $w(\beta_{i_j})x(\beta_{i_j}) \ge 0$)

$\ge L(x(\beta_{i_j}),u) - vx(\beta_{i_j})$ for $(u,v) \ge 0$
    (Since $g(x(\beta_{i_j})) \le 0$ and $x(\beta_{i_j}) \ge 0$)

$\ge \phi(u,v)$
    (By definition of $\phi$)


Note that for $\{\beta_{i_j}\} \uparrow \infty$, the sequence $\{f(x(\beta_{i_j}))\}$ of minima of (3.14)
with $\beta = \beta_{i_j}$, constitutes a nonincreasing sequence bounded below by
$\inf_{x \in X} f(x)$. Hence $\{f(x(\beta_{i_j}))\}$ converges and

(3.22)  $\inf_{x \in X} f(x) \le \lim_{j \to \infty} f(x(\beta_{i_j}))$

Letting $\epsilon_j \to 0$ in the string of inequalities of (3.21) gives

$\phi(\bar{u},\bar{v}) \ge \lim_{j \to \infty} f(x(\beta_{i_j})) \ge \phi(u,v) \quad \forall (u,v) \ge 0$

Hence

(3.23)  $\phi(\bar{u},\bar{v}) = \lim_{j \to \infty} f(x(\beta_{i_j})) = \max_{(u,v) \ge 0} \phi(u,v) = \max_{(u,v) \ge 0} \; \inf_{x \in R^n} L(x,u) - vx$

and $(\bar{u},\bar{v})$ solves the Lagrangian dual problem (3.3). The bound (3.12) on
$(\bar{u},\bar{v})$ follows from (3.17). To show a zero duality gap, just note that

$\inf_{x \in X} f(x) \le \lim_{j \to \infty} f(x(\beta_{i_j})) = \max_{(u,v) \ge 0} \phi(u,v) \le \inf_{x \in X} f(x)$


where the first inequality follows from (3.22), the equality from (3.23)
and the last inequality from the weak duality theorem for the Lagrangian
dual [4,1]. Hence

$\inf_{x \in X} f(x) = \max_{(u,v) \ge 0} \phi(u,v)$


We remark that the existence part of this theorem and the zero duality
gap result can also be derived as a consequence of the strong duality theorem
of Lagrangian duality (e.g. [4, Theorem 3]) which is based on the entirely
different argument of a separating hyperplane. Our explicit bound on the
dual optimal variables (3.12) however does not follow from Lagrangian duality
and is based on the recent boundedness results of [10].


4. Penalty Functions in Linear Programming


In this final section we show how to use penalty function results to
determine precisely the value of the parameter in the quadratic perturbation
to a linear program [6,7,8] in order to obtain a solution to the perturbed
problem which is dual feasible to within any preassigned tolerance. This
is a practical and important issue which has not been completely resolved
before in the iterative successive overrelaxation (SOR) methods for solving
huge sparse linear programs [8].


We consider the primal linear program

(4.1)  $\max_x \; cx$  s.t.  $Ax \le b, \quad x \ge 0$

where A is a given $m \times n$ real matrix, $c \in R^n$ and $b \in R^m$, and its dual

(4.2)  $\min_{(u,v)} \; bu$  s.t.  $v - A^T u = -c, \quad u, v \ge 0$

In [8] it has been shown that the perturbed primal program

(4.3)  $\max_x \; cx - \frac{\epsilon}{2}xx$  s.t.  $Ax \le b, \quad x \ge 0$

is solvable for all $\epsilon \in (0,\bar{\epsilon}]$ for some $\bar{\epsilon}$ if and only if (4.1) is solvable,
in which case the unique solution $\bar{x}$ of (4.3) for $\epsilon \in (0,\bar{\epsilon}]$ is independent
of $\epsilon$ and is the point in the solution set of (4.1) with least 2-norm. If
we consider the Wolfe dual to (4.3) we obtain

(4.4)  $\min_{(x,u,v)} \; bu + \frac{\epsilon}{2}xx$  s.t.  $c - \epsilon x - A^T u + v = 0, \quad u, v \ge 0$

Elimination of x through the constraint relation

(4.5)  $x = \frac{1}{\epsilon}(-A^T u + v + c)$

gives

(4.6)  $\min_{(u,v) \ge 0} \; bu + \frac{1}{2\epsilon}\|-A^T u + v + c\|_2^2$

which is precisely the exterior penalty function associated with the dual
linear program (4.2) with penalty parameter $1/\epsilon$. Using standard exterior
penalty function results, one needs that $\epsilon \to 0$ in order for solutions
$(u(\epsilon),v(\epsilon))$ of (4.6) to approach a solution of the dual linear problem (4.2).
However by computing x from $(u(\epsilon), v(\epsilon))$ through the relation (4.5), it
turns out [8] that for $\epsilon \in (0,\bar{\epsilon}]$, x is independent of $\epsilon$ and is the unique
point in the solution set of (4.1) with least 2-norm. In [8] SOR methods
were prescribed for solving (4.6) for $\epsilon$ sufficiently small and then
computing x from (4.5). Very large sparse problems (n = 20,000, m = 5,000) were
solved by this technique, without knowing what $\bar{\epsilon}$ is, but merely by
decreasing $\epsilon$ until certain approximate optimality criteria were met. We would
like to show here that by solving the penalty problem (4.6) for only two
values of $\epsilon$, we can satisfy the Karush-Kuhn-Tucker optimality conditions
for the linear program to any preassigned tolerance. In fact such a solution
will be primal feasible, satisfy the complementarity conditions between primal
and dual linear programs, and satisfy dual feasibility to any required
tolerance. More specifically we have the following.
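The passage from (4.6) to a primal point via (4.5) can be illustrated on a tiny dense problem. The sketch below is not the SOR method of [8]: it swaps in plain projected gradient descent on the nonnegative orthant, with a step size taken from a simple Lipschitz bound $(\|A\|_F^2 + 1)/\epsilon$ on the gradient, and uses pure Python lists. All function names and the iteration count are assumptions made for illustration.

```python
def residual(A, c, u, v):
    """r = -A^T u + v + c, the dual equality residual appearing in (4.6)."""
    m, n = len(A), len(c)
    return [-sum(A[i][j] * u[i] for i in range(m)) + v[j] + c[j] for j in range(n)]


def solve_dual_penalty(A, b, c, eps, iters=2000):
    """Minimize b.u + (1/(2*eps)) * ||r||^2 over (u, v) >= 0 by projected gradient.

    Plain projected gradient is a stand-in here, not the SOR iteration of [8].
    """
    m, n = len(A), len(c)
    u, v = [0.0] * m, [0.0] * n
    frob2 = sum(a * a for row in A for a in row)
    step = eps / (frob2 + 1.0)        # 1 / (upper bound on the gradient's Lipschitz constant)
    for _ in range(iters):
        r = residual(A, c, u, v)
        grad_u = [b[i] - sum(A[i][j] * r[j] for j in range(n)) / eps for i in range(m)]
        grad_v = [r[j] / eps for j in range(n)]
        # gradient step followed by projection onto the nonnegative orthant
        u = [max(0.0, u[i] - step * grad_u[i]) for i in range(m)]
        v = [max(0.0, v[j] - step * grad_v[j]) for j in range(n)]
    return u, v


def recover_x(A, c, u, v, eps):
    """Primal point from relation (4.5): x = (1/eps) * (-A^T u + v + c)."""
    return [rj / eps for rj in residual(A, c, u, v)]
```

For the toy linear program max $x_1 + x_2$ subject to $x_1 \le 1$, $x_2 \le 1$, $x \ge 0$ (A the 2 by 2 identity, b = c = (1, 1)), solving with $\epsilon = 0.5$ and with $\epsilon = 0.25$ gives different dual iterates u but, in both cases, a recovered x close to (1, 1). This matches the statement that x is independent of $\epsilon$ once $\epsilon$ is small enough.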

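4.1 Theorem Let $\delta > 0$, $\epsilon_1 > 0$, let $(\hat{u},\hat{v})$ be dual feasible, that is
$\hat{v} = A^T\hat{u} - c \ge 0$, $\hat{u} \ge 0$, and let $(u(\epsilon_1), v(\epsilon_1))$ be a solution of (4.6) with
$\epsilon = \epsilon_1$. If $b\hat{u} \le bu(\epsilon_1)$ then $(\hat{u},\hat{v})$ solves the dual problem (4.2), else for

(4.7)  $\epsilon_2 < \epsilon_1$ and $\epsilon_2 \le \dfrac{\delta}{b\hat{u} - bu(\epsilon_1)}$

it follows that

(4.8)  $\frac{1}{2}\|-A^T u(\epsilon_2) + v(\epsilon_2) + c\|_2^2 \le \delta, \quad bu(\epsilon_2) \le \min \{bu \mid A^T u \ge c, \; u \ge 0\}$

where $(u(\epsilon_2), v(\epsilon_2))$ is a solution of (4.6) with $\epsilon = \epsilon_2$. Furthermore for
$x(\epsilon_2)$ defined by

(4.9)  $x(\epsilon_2) := \frac{1}{\epsilon_2}(-A^T u(\epsilon_2) + v(\epsilon_2) + c)$

we have that the Karush-Kuhn-Tucker conditions for the linear program (4.1)
are satisfied to within a tolerance $\delta$ as follows

(4.10)  $x(\epsilon_2) \ge 0, \quad Ax(\epsilon_2) \le b, \quad u(\epsilon_2) \ge 0, \quad v(\epsilon_2) \ge 0$
        $u(\epsilon_2)(b - Ax(\epsilon_2)) = 0, \quad v(\epsilon_2)x(\epsilon_2) = 0$
        $\|-A^T u(\epsilon_2) + v(\epsilon_2) + c\|_2 \le (2\delta)^{1/2}$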

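Proof The first part of the theorem, (4.7)-(4.8), follows directly from
Theorem 2.3. The last part of the theorem (4.10) follows from (4.8) and
from the Karush-Kuhn-Tucker optimality conditions for (4.6) with $\epsilon = \epsilon_2$,
that is

(4.11)  $b - \frac{1}{\epsilon_2}A(-A^T u(\epsilon_2) + v(\epsilon_2) + c) \ge 0, \quad u(\epsilon_2) \ge 0$
        $u(\epsilon_2)\left(b - \frac{1}{\epsilon_2}A(-A^T u(\epsilon_2) + v(\epsilon_2) + c)\right) = 0$
        $\frac{1}{\epsilon_2}(-A^T u(\epsilon_2) + v(\epsilon_2) + c) \ge 0, \quad v(\epsilon_2) \ge 0$
        $\frac{1}{\epsilon_2}v(\epsilon_2)(-A^T u(\epsilon_2) + v(\epsilon_2) + c) = 0$

These conditions together with (4.8) and the definition (4.9) imply (4.10). □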
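The two-value choice of the perturbation parameter in Theorem 4.1 can be traced on a one-variable example. The sketch below assumes the toy linear program max x subject to $x \le 1$, $x \ge 0$ (so A = [1], b = [1], c = [1]), for which the solution of (4.6) can be worked out by hand as $u(\epsilon) = 1 - \epsilon$, $v(\epsilon) = 0$ when $0 < \epsilon \le 1$. That closed form, the halving rule used to enforce $\epsilon_2 < \epsilon_1$, and all function names are assumptions for illustration only.

```python
def dual_penalty_solution(eps):
    """Hand-derived solution (u, v) of (4.6) for the toy LP; valid for 0 < eps <= 1."""
    return 1.0 - eps, 0.0


def choose_eps2(b_u_hat, b_u_eps1, eps1, delta):
    """Pick eps2 satisfying (4.7), or None if (u_hat, v_hat) already solves the dual."""
    gap = b_u_hat - b_u_eps1
    if gap <= 0.0:
        return None
    # eps2 < eps1 (enforced by halving, an arbitrary choice) and eps2 <= delta / gap
    return min(0.5 * eps1, delta / gap)


def kkt_report(eps2):
    """Evaluate the quantities appearing in (4.10) for the toy LP at eps = eps2."""
    u, v = dual_penalty_solution(eps2)
    r = -u + v + 1.0                 # -A^T u + v + c with A = [1], c = [1]
    x = r / eps2                     # relation (4.9)
    return {
        "x": x,
        "primal_slack": 1.0 - x,     # b - A x
        "comp_u": u * (1.0 - x),     # u (b - A x)
        "comp_v": v * x,             # v x
        "dual_residual": abs(r),     # ||-A^T u + v + c||_2
    }
```

Taking the dual feasible point $\hat{u} = 2$ (so $\hat{v} = 1$), $\epsilon_1 = 0.5$ and $\delta = 10^{-4}$, the first solve gives $bu(\epsilon_1) = 0.5$, hence a gap of 1.5 and an $\epsilon_2$ of roughly $6.7 \times 10^{-5}$. The report at that $\epsilon_2$ shows x essentially equal to 1, complementarity products essentially zero, and a dual residual well below $(2\delta)^{1/2}$.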


Appendix 


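A.1 Theorem Let $\psi(s) := \inf_{t \in T} h(s,t)$ where $h: S \times T \to R$, $\emptyset \ne S \subset R^m$,
$\emptyset \ne T \subset R^n$ and h is upper semicontinuous on S with respect to S for
each fixed $t \in T$. Then $\psi$ is upper semicontinuous with respect to S at
each $\bar{s} \in S$ for which $\psi(\bar{s}) > -\infty$.

Proof Suppose $\psi$ is not usc at $\bar{s}$ with respect to S. Then

(A.1)  $\exists \epsilon > 0: \; \forall \delta > 0 \; \exists s(\delta) \in S: \; \|s(\delta) - \bar{s}\| < \delta, \quad \psi(s(\delta)) - \psi(\bar{s}) > \epsilon$

Let $\epsilon$ be fixed. Since $-\infty < \psi(\bar{s}) = \inf_{t \in T} h(\bar{s},t)$, there exists $t(\epsilon) \in T$
such that

(A.2)  $h(\bar{s}, t(\epsilon)) < \psi(\bar{s}) + \epsilon$

Combining (A.1), (A.2) and the definition of $\psi$ gives

(A.3)  $h(\bar{s}, t(\epsilon)) < \psi(\bar{s}) + \epsilon < \psi(s(\delta)) \le h(s(\delta), t(\epsilon))$
       $\forall \delta > 0$, for some $s(\delta) \in S$ such that $\|s(\delta) - \bar{s}\| < \delta$

Since $h(s, t(\epsilon))$ is usc with respect to S at $\bar{s} \in S$ we have

(A.4)  $\forall \gamma > 0, \; \exists \delta(\gamma) > 0: \; \forall s \in S, \; \|s - \bar{s}\| < \delta(\gamma), \quad h(s, t(\epsilon)) < h(\bar{s}, t(\epsilon)) + \gamma$

Combining (A.3) and (A.4) gives

(A.5)  $h(\bar{s}, t(\epsilon)) < \psi(\bar{s}) + \epsilon < h(\bar{s}, t(\epsilon)) + \gamma \quad \forall \gamma > 0$

Since $\bar{s}$ and $t(\epsilon)$ do not depend on $\gamma$, (A.5) gives a contradiction by letting
$\gamma$ approach zero. Hence $\psi$ is usc at $\bar{s}$ with respect to S. □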